Live-cell microscopy or fluorescence anisotropy with budded baculoviruses—which way to go with measuring ligand binding to M4 muscarinic receptors?

M4 muscarinic acetylcholine receptor is a G protein-coupled receptor (GPCR) that has been associated with alcohol and cocaine abuse, Alzheimer's disease, and schizophrenia which makes it an interesting drug target. For many GPCRs, the high-affinity fluorescence ligands have expanded the options for high-throughput screening of drug candidates and serve as useful tools in fundamental receptor research. Here, we explored two TAMRA-labelled fluorescence ligands, UR-MK342 and UR-CG072, for development of assays for studying ligand-binding properties to M4 receptor. Using budded baculovirus particles as M4 receptor preparation and fluorescence anisotropy method, we measured the affinities and binding kinetics of both fluorescence ligands. Using the fluorescence ligands as reporter probes, the binding affinities of unlabelled ligands could be determined. Based on these results, we took a step towards a more natural system and developed a method using live CHO-K1-hM4R cells and automated fluorescence microscopy suitable for the routine determination of unlabelled ligand affinities. For quantitative image analysis, we developed random forest and deep learning-based pipelines for cell segmentation. The pipelines were integrated into the user-friendly open-source Aparecium software. Both image analysis methods were suitable for measuring fluorescence ligand saturation binding and kinetics as well as for screening binding affinities of unlabelled ligands.

Development of novel drugs targeting the M4 receptor is difficult, as the similarity of the orthosteric binding sites of all mAChRs leads to low subtype selectivity of ligands [9]. One solution is the development of allosteric modulators, which may exhibit higher subtype selectivity but relatively lower affinities [10]. To find suitable drugs, ligand screening remains an important step in the drug development process.
Screening for new drug candidates using fluorescence methods has become quite popular due to several advantages over radioligand-based assays [11]. However, until now, only a limited number of fluorescence ligands have been available for mAChR, and to our knowledge, none have been extensively used to develop assays to study M 4 receptors [12][13][14]. Recently, several novel low molecular weight fluorescently labelled ligands targeting mAChRs were described [15,16]. Of these ligands, TAMRA labelled UR-CG072 and UR-MK342 have already been successfully used for studying M 2 receptors in NanoLuc luciferase bioluminescence resonance energy transfer (nanoBRET) and fluorescence anisotropy (FA) assays [17]. Even though these ligands show a slight preference for the M 2 receptor, they still have a high affinity for M 1 and M 4 receptors. Therefore, the new fluorescent ligands should be suitable as probes for studying M 4 receptors in drug candidate screening assays as well as in a large variety of fluorescence microscopy techniques from live tissue systems to single-molecule studies [18][19][20].
One of the most common options for characterizing fluorescent probe binding to proteins, including GPCRs, is the FA method [21][22][23][24][25]. For the successful development of FA assays, several unique aspects must be considered. Most importantly, FA is a ratiometric assay with its value depending on the ratio of bound and free ligand. Therefore, all experiments must be designed in a way that the probe and receptor concentrations are in a similar range, which means that ligand and receptor depletion should be taken into account [26]. The main advantage of the FA method is that there is no need to separate bound ligand from the free ligand, making it easy to continuously collect time-course data during ligand binding. These time-course data can be used to obtain kinetic parameters and to develop reaction kinetics models of ligand binding for more insight into the complicated regulation of signal transduction. In addition to cell membranes, budded baculovirus (BBV) particles can serve as a high-quality receptor source for FA assays. BBV particles are advantageous because they have a fixed cylindrical shape (approx. 50 nm × 300 nm) and homogeneous size distribution, resulting in minimal noise and small variability between replicates compared to membrane preparations [26][27][28]. Due to the small size and low sedimentation rate of BBV particles, they are well suited for performing homogeneous assays. However, downstream signalling cascades are not present in BBV particles. Furthermore, BBV particles are produced in Sf9 insect cells, where the membrane composition differs from mammalian systems.
Most of these problems can be avoided by using more natural live-cell assays for receptor display. Among the multiple assays developed [29], NanoBRET has gained considerable popularity in recent years due to its homogeneous format, the possibility of real-time measurements and relatively good compatibility with a wide array of fluorophores. However, it requires genetically modified receptors, which may influence ligand binding and receptor activation [30]. Studying wild-type receptors is more difficult because the receptor cannot be tagged, which rules out the high sensitivity of bioluminescence approaches. Furthermore, the plate reader-based RET methods only provide cell population averages instead of single-cell resolution information, which may hide some important effects. One solution to both problems is flow cytometry, which can measure fluorescent ligand binding to individual cells. However, it can neither follow binding to a single cell over time nor spatially resolve which part of the cell the fluorescence originates from [31]. By contrast, high-throughput microscopy can provide spatial information as well as time-course information for the same cells, making more detailed analysis possible. On the downside, extracting pharmacologically relevant quantitative information from the bioimages requires more complex data analysis algorithms. However, once an automated data analysis solution with user-friendly software exists, it can be reused in future studies.
Microscopy methods open many possibilities for assay setup, but performing time-resolved measurements with cellular resolution is not trivial, and existing methods have several potential issues [32]. In previously published works, the kinetics of ligand binding to live cells in a high-throughput screening (HTS) compatible manner have only been analysed by the fluorescence intensity of the whole image [33,34]. For these methods, it is necessary to seed cells consistently as a high-confluency monolayer, but this is either difficult or practically impossible to achieve with some cell lines [35]. Furthermore, it is much more difficult to identify individual cells from a dense monolayer, thus reducing the number of parameters that can be studied. In addition, dense monolayers can significantly affect physico-chemical environmental parameters such as oxygen concentration, which can also have more direct effects on muscarinic receptor signalling [36]. For example, transient hypoxic conditions lead to increased phosphorylation of M1 and M2 receptors [37]. Finally, dense cell monolayers can easily cause focusing errors in automated microscopy, as some cells may have detached or formed a second layer.
A better approach was developed with HEK-293-D 3 R cells, which uses a machine-learning algorithm for detecting only the fluorescence intensity originating from cell membranes in equilibrium conditions and does not rely on dense monolayers [35]. However, ligand-binding kinetics were not analysed in that study. Nevertheless, kinetic measurements should be possible with a similar setup after adjusting the experimental design and the image analysis pipeline.
The most difficult steps of microscopy image analysis are usually cell detection and segmentation, which are necessary for robust quantification of the fluorescence signal. Approaches for these tasks have gone through a paradigm shift from classical computer vision techniques to machine learning and especially deep-learning (DL) methods. Deep neural networks dominate most of the developed benchmark datasets for general problems as well as bioimage analysis specifically [38][39][40]. A large number of DL architectures have been developed over the past few years, but their wide application can be limited by compatibility issues with popular image analysis software and by designs that are too complex for life scientists to understand comprehensively [41][42][43][44][45][46][47][48]. Therefore, the widely supported and well-known U-Net architecture is used in the present study for cell segmentation from bright-field images, as it has shown good results for similar microscopy images [43,49].
In this study, we developed new fluorescence-based ligand-binding assays for the M4 receptor. These assays use two recently developed 5-carboxytetramethylrhodamine (5-TAMRA) labelled dibenzodiazepinone derivatives, UR-MK342 and UR-CG072 [16], and two different receptor sources. As both BBV particle-based FA and live cell-based microscopy assays have distinct advantages, we studied and compared the two options and found that both are viable. To the best of our knowledge, this is the first detailed description of M4 receptor fluorescence ligand-binding assays, which opens up many new possibilities to study these receptors.

Cell culture
Spodoptera frugiperda Sf9 (Invitrogen Life Technologies, Schwerte, Germany) cells were maintained as a suspension culture in serum-free insect cell growth medium EX-CELL 420 (Sigma-Aldrich) at 27°C in a non-humidified environment.
Cell culture viability and density were determined with an Automated Cell Counter TC20 (Bio-Rad Laboratories, Sundbyberg, Sweden) by the addition of 0.2% trypan blue (Sigma-Aldrich). All experiments with CHO-K1-hM4R, CHO-K1 and Sf9 cell cultures were performed with passages 40-50, 33 and 2-25, respectively. Mammalian cell lines were tested and determined to be mycoplasma-negative.

Preparation of budded baculovirus particles
The human M4 receptor in pcDNA3.1+ was purchased from cDNA Resource Center (www.cdna.org), and manufacturing and production of BBV containing the human M4 receptor were performed as described in [25] with some modifications. For cloning M4 into the pFastBac vector, BamHI and XbaI sites were used with enzymes from Thermo Fisher Scientific (Schwerte, Germany). To transfect the bacmid into Sf9 cells, the transfection reagent FuGene 6 (Promega Corporation, Madison, USA) was used according to the manufacturer's protocol. After the viruses were generated and collected, the amount of infectious viral particles per ml (IVP/ml) for all the baculoviruses was determined with the image-based cell size estimation assay [52].
To produce the BBV particles, Sf9 cells were infected with a multiplicity of infection (MOI) of 3 and incubated for 4 days (end viability of Sf9 cells was 55%). The supernatant, containing BBV particles, was gathered by centrifugation for 15 min at 1600 g. Next, the BBV particles were concentrated 40-fold by high-speed centrifugation (48 000 g at 4°C) for 40 min, followed by washing with the assay buffer and homogenization with a syringe and a 30G needle. The suspension was divided into aliquots and stored at −90°C until the experiments. BBV particle preparations were done several times. Receptor concentrations for the BBV particle stocks were estimated to be R stock_UR-CG072 = 9.7 ± 1.1 nM and R stock_UR-MK342 = 5.5 ± 0.7 nM, using the model described in [27].

Fluorescence anisotropy experiments
FA experiments were carried out on black flat-bottom half-area 96-well plates (Corning, Glendale, USA). A suitable combination of the fluorescent ligand, competitive ligand and BBV particle suspension was added to each well. Assay buffer was added so that the final liquid volume in each well was 100 µl.
In saturation binding experiments, two concentrations of fluorescent ligands were used: 2 nM and 20 nM for UR-CG072, and 1 nM and 6 nM for UR-MK342. For determination of non-specific binding, 2 µM or 20 µM UNSW-MK259 was used in the case of UR-CG072 and 1 µM or 6 µM scopolamine was used in the case of UR-MK342. Two-fold serial dilutions of BBV particle suspension were added starting from 60 µl. Wells without BBV particles were used as a free fluorescent ligand control.
For competition binding experiments, the concentrations of fluorescent ligands UR-CG072 and UR-MK342 were kept constant at 5 nM, and the volume of BBV particles was also kept constant at 20 µl (C final ≈ 1-2.2 nM). Five- or six-fold serial dilutions of the competitive ligands were used. In addition, replicate wells with no competitive ligand were included, and for blank correction, replicate wells with only BBV particles were included. Measurements were carried out at 3 min intervals for 13-15 h at 27°C. A custom-made glass lid was used in all the experiments to minimize evaporation from the wells. In all cases, BBV particles were added as the last component to initiate the ligand-binding process.
For kinetic experiments, 5 nM UR-CG072 or 6 nM UR-MK342 was used. In non-specific binding wells, 6 µM or 3 µM scopolamine was added, respectively. The reaction was initiated by the addition of 20 µl of M4 receptor-displaying BBV particles. After 180 min, the dissociation was initiated by the addition of 2 µl of 300 µM (C final = 6 µM) or 150 µM (C final = 3 µM) scopolamine for UR-CG072 or UR-MK342, respectively. Two microlitres of assay buffer were added instead of the competitive ligand to association kinetics wells to maintain an equivalent volume in all wells.
royalsocietypublishing.org/journal/rsob Open Biol. 12: 220019
In all experiments, the fluorescence intensity values were blank corrected for BBV particle autofluorescence by subtracting the respective parallel or perpendicular fluorescence intensity value of a blank well from the respective measurement well. The blank wells contained no ligands but only the same concentration of BBV particles as the measurement well.
FA measurements were performed with multi-mode plate reader Synergy NEO (BioTek Instruments, Winooski, USA), which is equipped with a polarizing 530(25) nm excitation filter and 590(35) nm emission filter allowing simultaneous parallelly and perpendicularly polarized fluorescence detection. At least three individual experiments were carried out in duplicate.

Microscopy of DiI stained CHO-K1-hM 4 R cells
CHO-K1-hM4R cells were grown as described above and seeded with a density of 25 000 cells per well into a µ-Plate 96 well Black plate (Ibidi, Gräfelfing, Germany) 5 h before the experiment. A stock solution of 1 mM 1,1′-dioctadecyl-3,3,3′,3′-tetramethylindocarbocyanine perchlorate (DiI) (Invitrogen, Eugene, Oregon, USA) in DMSO stored at −20°C was thawed and sonicated in an ultrasound bath for 5 min to disrupt aggregates. Cell medium was removed and replaced with 200 µl per well of 2 µM DiI in Dulbecco's phosphate-buffered saline (DPBS) with Mg2+ and Ca2+ (Sigma-Aldrich) to stain the cell membranes. The cells were incubated with the solution for 10 min before imaging. The cells were imaged with a Cytation 5 cell imaging multi-mode plate reader equipped with a 20× LUCPLFLN objective (Olympus) in the bright-field and RFP channels (LED light source with excitation filter 531(40) nm and emission filter 593(40) nm for the RFP channel; BioTek Instruments) with the following parameters for bright-field: LED intensity = 4, integration time = 110 ms, camera gain = 24; and for the RFP fluorescence channel: LED intensity = 1, integration time = 71 ms, camera gain = 24. The cells were imaged in the montage mode (196 locations) with Z-stack (10 planes, 4 planes below and 5 planes above focus) to cover any imaging location-dependent variability and simulate potential autofocusing errors.

Live-cell ligand-binding imaging
CHO-K1-hM4R cells were seeded into a µ-Plate 96 well Black plate (Ibidi) at densities of 25 000-55 000 cells per well in DMEM/F-12 medium and incubated for 5-7 h. Immediately before the measurement, the cell culture medium was exchanged for the same medium containing ligands. At all times, the well volume was kept at 200 µl.
For determining UR-CG072 affinity to the M 4 receptor, saturation binding experiments were carried out using twofold dilutions of UR-CG072 starting from 8 nM. Non-specific binding was measured in the presence of 3.7 µM scopolamine. The cells were incubated with ligands in Cytation 5 at 5% CO 2 and 37°C for 2 h before imaging.
For measuring UR-CG072 binding kinetics to M 4 receptor, 2 nM UR-CG072 was added to the cells, and imaging was immediately initiated. To achieve sufficient temporal resolution, only two wells were imaged in parallel. After approximately 3 h of association, 10 µl of 100 µM scopolamine (C final = 5 µM) was added to start dissociation.
The competition binding assay was performed using 2 nM UR-CG072. The different competitive ligand concentrations were pipetted to the plate in randomized order to avoid a correlation between well imaging order and concentration. It was determined that 2 h was sufficient to reach equilibrium for IC 50 value measurement as the IC 50 values for scopolamine and carbachol at 2 and 5 h remained constant within uncertainty limits.
The imaging was performed with Cytation 5 as described above. Saturation binding experiments were performed with the following imaging parameters in bright-field: LED intensity = 4, integration time = 110 ms, camera gain = 24; and in the RFP fluorescence channel: LED intensity = 1 or 2, integration time = 827 ms, camera gain = 24. For kinetic binding assays, all the parameters were the same except for the RFP fluorescence channel LED intensity, which was 5. For competition binding assays, the imaging parameters used in bright-field were: LED intensity = 5, integration time = 1222 ms, camera gain = 0; and in the RFP fluorescence channel: LED intensity = 5, integration time = 613 ms, camera gain = 24, or the same as for kinetic experiments. The cells were imaged in the montage mode (4 locations per well) with Z-stack (10 planes: 4 planes below the focal plane, 1 in focus and 5 planes above the focal plane).

Cell segmentation with ilastik software
To develop a bright-field cell segmentation model based on the random forest (RF) algorithm, a total of three ilastik [53] pixel classification models were trained: RF-FL-1 (random forest-based fluorescence image cell segmentation), RF-BF-1 (random forest-based bright-field image cell segmentation model version 1) and RF-BF-2 (random forest-based brightfield image cell segmentation model version 2). Two of the models (RF-FL-1 and RF-BF-1) were intermediate helper models used for training the final RF-BF-2 model. Here, the models are named by combining the model type (RF or U-Net3), an input imaging modality that the model used for cell detection (BF for bright-field images and FL for fluorescence images), followed by the index of the model of the particular type. For developing the RF-FL-1 model, a set of fluorescence images of CHO-K1-hM 4 R cells stained with fluorescent lipophilic dye DiI was generated. Thirty of these images were randomly chosen from different locations of the well for the training set. The images were in-focus (10 images), 3 µm above (10 images) or 3 µm below (10 images) the focal plane to increase the model robustness against focusing errors. The Gaussian smoothing, Laplacian of Gaussian, Gaussian gradient magnitude, difference of Gaussians, structure tensor eigenvalues and Hessian of Gaussian eigenvalues features were selected for sigma values of 0.70, 1.00, 1.60, 3.50, 5.00, 10.00, 15.00 and 20.00 pixels. In addition, the Gaussian smoothing feature with a sigma value of 0.30 pixels was selected in the ilastik feature selection stage. RF-FL-1 was set up to perform binary pixel classification using cell and background classes. Some pixels of cells and background were manually annotated by adding annotations over the respective pixels of the in-focus images. More annotations were added at the fringe of cells to enhance the accuracy of the predictions. 
The annotations of the in-focus images were transferred to the respective out-of-focus images from the same field of view. With these annotations, the RF-FL-1 model was trained. Model export was set to generate simple binary segmentation. Then, the cells in the rest of the fluorescence images (186 images) were segmented in the batch processing mode, creating a set of masks for 196 fields of view with the ten fields of view remaining in the training set. Next, the binary segmentation images were automatically reclassified into three classes: intracellular area (IC), membrane (MB) and near-membrane background (NMBG), with the rest of the pixels representing background (BG). The IC class was generated by image erosion of the predicted cell masks with a 2-pixel radius disk structuring element. The MB class was generated by image dilation of the IC masks with a 3-pixel radius disk structuring element, and pixels were assigned to the NMBG class by further image dilation of the MB images with a 7-pixel radius disk structuring element, excluding pixels already assigned to the MB or IC classes. Next, a class balancing step was performed to obtain an equal number of pixels for each of the classes. For that, all of the pixels from the class with the smallest number of pixels were selected, and an equal number of pixels was selected randomly from the IC and NMBG classes. The operation was performed for each image separately. Images generated by this process were considered as the ground truth for training the RF-BF-1 model. RF-BF-1 was trained to detect cells from contrasted projections of bright-field Z-stacks. The Z-stack of bright-field images was converted into a single higher contrast image as described in [35]. Twenty fields of view were used as a training set in RF-BF-1 for the detection of IC, MB and NMBG areas from the contrast-enhanced bright-field image projections.
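The reclassification and class balancing steps described above can be illustrated with a minimal sketch. It uses only the Python standard library and represents a mask as a set of (row, column) pixel coordinates. The function names, the treatment of pixels outside the image as background, and the reading that the MB class is the dilated IC mask minus the IC pixels are assumptions made for illustration, not the authors' implementation.

```python
import random


def disk_offsets(radius):
    """Offsets (dy, dx) covered by a disk structuring element of the given radius."""
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= radius * radius]


def erode(pixels, radius):
    """Keep a pixel only if the whole disk around it lies inside the mask."""
    offsets = disk_offsets(radius)
    return {(y, x) for (y, x) in pixels
            if all((y + dy, x + dx) in pixels for dy, dx in offsets)}


def dilate(pixels, radius, shape):
    """Grow the mask by the disk, clipping at the image borders."""
    height, width = shape
    offsets = disk_offsets(radius)
    grown = set()
    for y, x in pixels:
        for dy, dx in offsets:
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                grown.add((ny, nx))
    return grown


def reclassify(cell_pixels, shape):
    """Split a binary cell mask into IC, MB and NMBG pixel sets."""
    ic = erode(cell_pixels, 2)                 # intracellular area
    mb = dilate(ic, 3, shape) - ic             # membrane ring (assumed: IC excluded)
    nmbg = dilate(mb, 7, shape) - mb - ic      # near-membrane background
    return {"IC": ic, "MB": mb, "NMBG": nmbg}


def balance(classes, seed=0):
    """Randomly subsample every class to the size of the smallest class."""
    rng = random.Random(seed)
    n = min(len(pixels) for pixels in classes.values())
    return {name: set(rng.sample(sorted(pixels), n))
            for name, pixels in classes.items()}


# Hypothetical example: one 20 x 20 pixel "cell" in a 60 x 60 pixel image.
cell = {(y, x) for y in range(20, 40) for x in range(20, 40)}
classes = reclassify(cell, (60, 60))
balanced = balance(classes)
print({name: len(pixels) for name, pixels in classes.items()})
print({name: len(pixels) for name, pixels in balanced.items()})
```

Pixels that fall into none of the three sets correspond to the BG class. In practice an array-based image library would be much faster; the set-based version is chosen here only to keep the sketch dependency-free.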
The Gaussian smoothing, Laplacian of Gaussian, Gaussian gradient magnitude, difference of Gaussians, structure tensor eigenvalues and Hessian of Gaussian eigenvalues features were selected for sigma values of 0.70, 1.00, 1.60, 3.50, 5.00, 10.00, 15.00, 20.00, 25.00, 30.00 and 35.00 pixels. In addition, the Gaussian smoothing filter with a sigma value of 0.30 pixels was selected in the ilastik feature selection stage. Twenty ground truth images generated by RF-FL-1 were used as labels of IC, MB, NMBG classes in the respective images to train a model for the detection of three classes of pixels from the contrast-enhanced bright-field images. The prediction quality was estimated by the recall, precision, F 1 score and Matthews correlation coefficient (MCC) metrics as shown in table 1. Classification quality metrics were measured by considering the IC pixels to form a positive class while all other classes (MB, NMBG, BG) were merged to form the negative class. Thus, misclassifications of pixels between MB, NMBG and BG classes had no impact on the quality metrics.
Finally, the RF-BF-2 model was trained to improve the prediction quality of the RF-BF-1 model by adjusting the class balance and by adding ground truth annotations to pixels that the RF-BF-1 model had failed to classify. The same training set of 20 contrast-enhanced bright-field images was used as for the RF-BF-1 model. In this training run, a fourth class of pixels was created for BG from all the previously unclassified pixels. To create ground truth images for RF-BF-2, the class balancing step was performed again as previously described. Additionally, the labels were improved by manually adding pixels that the RF-BF-1 model had failed to classify to each class. The same image features were used as in the RF-BF-1 model. The prediction quality of RF-BF-2 was evaluated with the test set (table 1 and figure 7g). By visual inspection, the addition of extra labels removed the largest and clearest misclassifications (figure 7f,g), and the remaining ones overlapped with areas where the volume of training data was already large. As the overall image detection parameters were not better for RF-BF-2 compared to RF-BF-1 (table 1), it was deemed that the model quality had reached a plateau and that further addition of data would not provide any significant improvement in model generalization. The model development pipeline is presented in figure 1a.

Cell segmentation with deep learning
For training the models for the DL pipeline, ten in-focus RFP fluorescence channel images of CHO-K1-hM4R cells stained with DiI dye were manually labelled using the ilastik pixel classification pipeline user interface. For that, pixels were classified as either cells or background. The manually generated annotations were exported. Next, a background correction step was used to remove systematic illumination differences from the fluorescence images. The ten images along with corresponding ground truth annotations were randomly sampled into training, validation and test sets as follows: six images in the training set, two images in the validation set and two images in the test set. The training and validation set images were cropped to the input size of the U-Net (288 × 288 pixels) and augmented with the imgaug library [54] using a sequential augmenter with the following augmentations: rescaling 0-5%, shearing 0-1 pixels, piecewise affine shearing 1-5%, random rotation ±45°, random left-right flip with 50% probability and random up-down flip with 50% probability. A total of 6000 training tiles and 2000 validation tiles were generated (1000 augmented tiles of each image). The U-Net inspired fully convolutional U-Net3 architecture (figure 1c) was used to train a model, U-Net3-FL-1 (U-Net3 architecture-based fluorescence image cell segmentation), for cell detection from the fluorescence images [47,49]. The training was carried out using the following parameters: Adam optimizer [55], learning rate = 0.0002, beta 1 = 0.9, beta 2 = 0.999, epsilon = 10⁻⁸, number of epochs = 20, loss function = binary cross-entropy. The validation set loss was confirmed to have reached a minimum within 20 epochs. The model quality was assessed for the test set images. Next, the model was used to predict the masks for 191 DiI labelled fluorescence images.
These images were again separated into training, validation and test sets along with the corresponding in-focus bright-field images of the same fields of view (133, 29 and 29 images, respectively). The focal plane had been manually chosen in a prior step. As it has been previously shown that similar DL network architectures require considerably more bright-field data to converge to an optimal solution compared to fluorescence data, a different strategy was chosen for training DL for cell detection from bright-field images [40,49,56]. As the training data volume was substantially larger, a data generator was used for cropping the images to the correct size (288 × 288 pixels) instead of predefined training and validation sets. A batch size of eight images was used during training.
No augmentation was used for bright-field data. The same model architecture was used for the U-Net3-BF-1 (U-Net3 architecture-based bright-field image cell segmentation) model as for U-Net3-FL-1. In this training run, early stopping with patience = 20 was used, and the model converged after 90 epochs. In addition, learning rate reduction with a factor of 0.1 and patience = 10 was used. All other parameters were the same as for the fluorescence-based model. The final model U-Net3-BF-1 was used to predict the segmentation of the test set, and equivalent metrics were calculated (table 1).
The model development pipeline is presented in figure 1b.

Image analysis pipeline
To carry out cell segmentation from all microscopy images, a suitable image analysis pipeline was developed. For using ilastik-based models, the same pipeline was used as in [35] with minor modifications. The ilastik segmentation label index was updated according to the RF-BF-1 model design (segmentation label index = 2), and morphological corrections were not used, as they were not necessary for whole-cell segmentation in contrast to contour segmentation. For using the U-Net3-BF-1 model for prediction, the MembraneTools module of Aparecium data and image analysis software (https://gpcr.ut.ee/aparecium.html) was updated to be able to use Keras framework [57] models for prediction. Unlike for RF-BF-1, for U-Net3-BF-1 only a single in-focus bright-field image was used as input instead of the contrast-enhanced image generated from bright-field Z-stacks. As the U-Net3-BF-1 model can predict only 288 × 288-pixel patches, the bright-field images are tiled before prediction and the predictions are later stitched to original size images (figure 1d, step a).
The quality mask was manually generated (figure 1d, stage 1) as previously described [35], and areas of low quality were removed from image quantification (figure 1d, step b).
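The tiling and stitching step mentioned above can be sketched as follows. This is an illustrative, dependency-free version that works on nested lists, not the Aparecium implementation. How Aparecium handles the image border is not stated in the text, so shifting the last tile back to end at the image edge (and letting later tiles overwrite the overlap) is an assumption, as are the names `tile_starts`, `predict_tiled` and `predict_tile`.

```python
def tile_starts(length, tile=288):
    """Start coordinates of tiles that together cover one image dimension."""
    if length < tile:
        raise ValueError("image dimension is smaller than the tile size")
    starts = list(range(0, length - tile + 1, tile))
    if starts[-1] + tile < length:
        # Assumed border handling: shift the final tile back so that it
        # ends exactly at the image edge and overlaps its neighbour.
        starts.append(length - tile)
    return starts


def predict_tiled(image, predict_tile, tile=288):
    """Apply predict_tile to every tile and stitch the results together.

    image is a list of rows; predict_tile takes a tile (list of rows)
    and must return a tile of the same size.
    """
    height, width = len(image), len(image[0])
    stitched = [[None] * width for _ in range(height)]
    for top in tile_starts(height, tile):
        for left in tile_starts(width, tile):
            patch = [row[left:left + tile] for row in image[top:top + tile]]
            result = predict_tile(patch)
            for dy, result_row in enumerate(result):
                # Later tiles overwrite earlier ones where tiles overlap.
                stitched[top + dy][left:left + tile] = result_row
    return stitched


# Hypothetical stand-in for a trained model: threshold each pixel at 0.5.
def threshold_tile(patch):
    return [[value > 0.5 for value in row] for row in patch]


# Small demonstration with 4 x 4 tiles on a 7 x 10 image.
demo = [[(r * 10 + c) / 70.0 for c in range(10)] for r in range(7)]
mask = predict_tiled(demo, threshold_tile, tile=4)
print(tile_starts(10, 4), sum(sum(row) for row in mask))
```

With a real Keras model, `predict_tile` would wrap the model's prediction call for a single 288 × 288 patch.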
For fluorescence image quantification, the in-focus fluorescence image was selected from the Z-stack manually and the image intensity was calculated only for the areas detected as cells by the segmentation model.

Pharmacological data analysis
Aparecium 2.0 software was used to blank the raw parallel and perpendicular intensity values and calculate the FA values using the formula [59]:
$$FA(t) = \frac{I(t)_{\parallel} - I(t)_{\perp}}{I(t)_{\parallel} + 2\,I(t)_{\perp}} \qquad (2.1)$$
where $I(t)_{\parallel}$ is the parallel fluorescence intensity and $I(t)_{\perp}$ is the perpendicular fluorescence intensity at time point t. Ki values were calculated with the Cheng-Prusoff equation [60], using the IC50 values obtained from data fitting with GraphPad Prism 5.0 (GraphPad Software, San Diego, USA) with a three-parameter logistic regression model (log(inhibitor) versus response).
For calculation of the kinetic parameters kon, koff and Kd_kinetic from the microscopy data, the GraphPad Prism 5.0 'Association then dissociation' model was used. Kd calculation from microscopy data was also done with GraphPad Prism 5.0, but the model used was 'one site-total and non-specific binding'.
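As an illustration of the two calculations described above, the following sketch computes the anisotropy from blank-corrected intensities and converts an IC50 value to Ki. It assumes the standard anisotropy definition (parallel minus perpendicular intensity, divided by parallel plus twice the perpendicular intensity) and the classical Cheng-Prusoff form Ki = IC50 / (1 + [L]/Kd), which does not correct for ligand depletion. The function names and all numerical values are hypothetical.

```python
def anisotropy(i_par, i_perp, blank_par=0.0, blank_perp=0.0):
    """Fluorescence anisotropy from blank-corrected polarized intensities."""
    par = i_par - blank_par      # blank-corrected parallel intensity
    perp = i_perp - blank_perp   # blank-corrected perpendicular intensity
    total = par + 2.0 * perp     # proportional to total emitted intensity
    if total <= 0:
        raise ValueError("blank-corrected total intensity must be positive")
    return (par - perp) / total


def cheng_prusoff_ki(ic50, probe_conc, probe_kd):
    """Classical Cheng-Prusoff conversion; all arguments in the same units."""
    return ic50 / (1.0 + probe_conc / probe_kd)


# Hypothetical raw readings and blank-well readings (arbitrary units).
print(round(anisotropy(1200.0, 800.0, blank_par=200.0, blank_perp=200.0), 4))

# Hypothetical competition experiment: IC50 = 30 nM, probe at 5 nM, Kd = 2.5 nM.
print(cheng_prusoff_ki(30.0, 5.0, 2.5))
```

Because receptor and probe concentrations in FA assays are of similar magnitude, values obtained from the classical conversion should be read as apparent Ki values.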
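The shape of an association-then-dissociation curve can be sketched with the usual pseudo-first-order model: an observed association rate of kon·[L] + koff up to the time at which the competitor is added, followed by mono-exponential dissociation with rate koff. The sketch below is a generic version written under that assumption and is not taken from GraphPad Prism; the parameter names and values are hypothetical, and the model ignores ligand depletion.

```python
import math


def association_then_dissociation(t, ligand, kon, koff, bmax, t_diss,
                                  nonspecific=0.0):
    """Signal at time t for association followed by dissociation at t_diss.

    ligand : free fluorescent ligand concentration (assumed constant)
    kon    : association rate constant (per concentration per time)
    koff   : dissociation rate constant (per time)
    bmax   : signal at full receptor occupancy
    """
    kd = koff / kon                           # kinetic dissociation constant
    k_obs = kon * ligand + koff               # observed association rate
    plateau = bmax * ligand / (ligand + kd)   # equilibrium specific signal
    if t < t_diss:
        specific = plateau * (1.0 - math.exp(-k_obs * t))
    else:
        at_switch = plateau * (1.0 - math.exp(-k_obs * t_diss))
        specific = at_switch * math.exp(-koff * (t - t_diss))
    return specific + nonspecific


# Hypothetical parameters: 2 nM ligand, kon = 0.01 /nM/min, koff = 0.02 /min,
# dissociation started after 180 min.
params = dict(ligand=2.0, kon=0.01, koff=0.02, bmax=100.0, t_diss=180.0)
for minute in (0, 30, 180, 215, 360):
    print(minute, round(association_then_dissociation(minute, **params), 2))
```

Fitting such a function to the cell-associated fluorescence time course yields kon and koff, and their ratio gives Kd_kinetic.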
For Kd calculation from FA data, a global model from [27], which takes ligand depletion into account, was used. To calculate kon, koff and Kd_kinetic from FA kinetic data, a modified version of IQMTools/SBToolbox2 (IntiQuan, Basel, Switzerland) was used to fit FA values with the previously published model [17]. The model assumes four possible interactions: between the receptor (R) and the fluorescence ligand (L), between the receptor and the competitive unlabelled ligand (C), between non-specific binding sites from the receptor preparation (NBV) and the fluorescent ligand, and between non-specific binding sites on the microplate (N) and the fluorescent ligand. The corresponding reactions can be described by the following schemes:
$$R + L \rightleftharpoons RL, \qquad R + C \rightleftharpoons RC, \qquad NBV + L \rightleftharpoons NBVL, \qquad N + L \rightleftharpoons NL$$
The concentrations in this model are connected to the predicted FA values through the equation:
$$FA(t) = \frac{FA_{RL}\{RL\}_t + FA_{L}\{L\}_t + FA_{NL}\{NL\}_t + FA_{NBVL}\{NBVL\}_t}{\{RL\}_t + \{L\}_t + \{NL\}_t + \{NBVL\}_t} \qquad (2.2)$$
where $\{RL\}_t$, $\{L\}_t$, $\{NL\}_t$ and $\{NBVL\}_t$ are the instantaneous concentrations of RL, L, NL and NBVL, respectively, at time point t, and $FA_{RL}$, $FA_{L}$, $FA_{NL}$ and $FA_{NBVL}$ are the intrinsic fluorescence anisotropies of the RL, L, NL and NBVL states, respectively. All the uncertainties given are weighted standard errors of the mean of at least three independent experiments if not stated otherwise.

Statistical analysis
For determining the quality of all machine-learning cell detection models, four metrics were considered, each computed from pixel-wise classification counts: true positive (TP) denotes the number of correctly detected pixels belonging to cells, true negative (TN) the number of correctly predicted pixels not belonging to cells, false positive (FP) the number of non-cell pixels detected as cells, and false negative (FN) the number of cell pixels detected as non-cell pixels.
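Two of the metrics discussed later in the text, the F 1 score and the Matthews correlation coefficient (MCC), can be computed from these pixel counts as sketched below. The definitions are the standard ones; the function names and the example counts are illustrative and hypothetical.

```python
import math


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, from pixel counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, from pixel counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical pixel counts for one predicted mask versus its ground truth
tp, tn, fp, fn = 9000, 80000, 1000, 1000
print(round(f1_score(tp, fp, fn), 3), round(mcc(tp, tn, fp, fn), 3))
```

Unlike the F 1 score, the MCC also takes true negatives into account, which matters when background pixels greatly outnumber cell pixels.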
To compare U-Net3-BF-1 and RF-BF-2 model qualities for determining IC 50 values from the live-cell microscopy assay, the R 2 values of the nonlinear fits were compared in a pairwise manner using a one-tailed Mann-Whitney U-test in GraphPad Prism 5.0, assuming that U-Net3-BF-1 is the superior model.
To determine the assay suitability for HTS applications, Z′ values were calculated according to the formula [61]:

$$Z' = 1 - \frac{3(\sigma_{+} + \sigma_{-})}{|\mu_{+} - \mu_{-}|}$$

where $\mu$ and $\sigma$ are the mean and standard deviation of the positive (+) and negative (−) control signals.

To observe significant changes in the FA signal, it is necessary that the mole ratios of both the free and bound fluorescence ligand change when receptor concentration, total fluorescence ligand concentration, competitive ligand concentration, time, or a combination of these factors is varied. This is best achieved when concentrations of the probe and its target protein are kept close to their binding K d . For these reasons, only certain fluorescence ligands with suitable fluorophores and binding affinities, usually in the low picomolar to low nanomolar range, are considered as probe candidates for FA assays. Two 5-TAMRA labelled ligands, UR-CG072 and UR-MK342 from [16], were chosen for the development of FA assays due to a suitable label and high affinity to the M 4 receptor determined by radioligand binding to whole cells.
First, saturation binding experiments were carried out to determine fluorescence ligand-binding affinities to M 4 receptors displayed on BBV particles. Both ligands showed similarly high binding affinities (K d_UR-CG072 = 3.6 ± 1.1 nM, K d_UR-MK342 = 1.2 ± 0.5 nM; figure 2), which are in good agreement with the radioligand binding values (K i_UR-CG072 = 3.7 ± 0.6 nM, K i_UR-MK342 = 0.97 ± 0.07 nM [16]). However, UR-MK342 binding has a larger dynamic range of FA values compared to UR-CG072. The same tendency was also found in FA assays with the M 2 receptor [17]. As the effect is evident for both receptor subtypes, it might be attributed to the more flexible linker in UR-CG072. Nevertheless, high affinity and sufficient dynamic range mean that both ligands would be suitable for kinetic measurements as well as for use as probes for measuring competitive ligand-binding parameters.
Next, the ligand-binding kinetics of UR-CG072 and UR-MK342 to the M 4 receptor were studied. In contrast to similar affinities, the kinetic properties of UR-CG072 and UR-MK342 were quite different (figure 3). The faster association and dissociation kinetics of UR-CG072 make it more suitable for FA-based screening assays as this allows increasing the assay throughput by shortening the incubation times which reduces problems concerning potential receptor source sedimentation, liquid evaporation or even degradation of the ligands or the receptor [62]. Faster kinetics is also beneficial for live-cell microscopy assays, where too long experiments can lead to problems with cell culture such as detachment and changes in medium composition.

Affinity screening a panel of MR ligands with UR-CG072 and UR-MK342
Fluorescence ligands are often applied for determining the affinities of unlabelled ligands. Therefore, the suitability of both ligands as reporter probes was studied in competition with unlabelled M 4 receptor ligands. For that, a panel of common M 4 receptor ligands was chosen such that the expected affinities would cover a wide range of values and contain both agonists and antagonists. In addition, some unlabelled ligands, which are structurally similar to the fluorescence ligands [50,51,63], were chosen to assess the assay's ability to work with dualsteric compounds. The set of ligands was investigated in competition binding experiments with both UR-CG072 and UR-MK342 (figure 4). Both fluorescent ligands can successfully be used as reporter ligands with a high signal-to-noise ratio and very good Z′ values (Z′ UR-CG072 = 0.52, Z′ UR-MK342 = 0.67), making the assay compatible with HTS formats, which generally require a minimum Z′ of 0.5. However, as UR-CG072 has faster kinetics than UR-MK342, a longer incubation time is needed to determine the competitive ligand affinity using UR-MK342. To avoid possible under- or overestimation of IC 50 values, it is important to wait until the equilibrium is reached [23].
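The Z′ (Z-prime) criterion for HTS compatibility used above can be sketched in Python with the standard definition based on the means and standard deviations of two control groups. Treating total binding and non-specific binding wells as the two controls is an assumption made here for illustration, and the well values are invented.

```python
from statistics import mean, stdev


def z_prime(pos, neg):
    """Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    window = abs(mean(pos) - mean(neg))
    if window == 0:
        raise ValueError("control means must differ")
    return 1.0 - 3.0 * (stdev(pos) + stdev(neg)) / window


# Hypothetical FA readings (arbitrary units) from control wells
total_binding = [100.0, 102.0, 98.0]
nonspecific_binding = [10.0, 12.0, 8.0]
z = z_prime(total_binding, nonspecific_binding)
print(round(z, 3), "HTS compatible" if z >= 0.5 else "below 0.5 threshold")
```

A value of 1 corresponds to an ideal assay with no variability, and values fall as the control distributions widen or move closer together.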
To make the measurement values comparable, the pK i values were calculated from the IC 50 value for each ligand using the Cheng-Prusoff equation. While not all assumptions of the Cheng-Prusoff equation [60] are fulfilled, it has been previously shown that with these ligands the potential systematic error introduced by this operation is relatively small [17]. pK i values obtained from experiments using the two different reporter ligands correlated very well (R 2 = 0.96), and the linear regression slope of the pK i values obtained with both probes is very close to unity (0.97 ± 0.04) while the intercept is close to zero (0.3 ± 0.3) (figure 5a). This confirms that both probes can be used to determine the unlabelled ligand affinities in the FA assay.
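The conversion from IC 50 to pK i can be illustrated with a short sketch of the Cheng-Prusoff relation, K i = IC 50 / (1 + [L]/K d ), where [L] and K d refer to the fluorescent reporter ligand. The function names and the example concentrations are hypothetical.

```python
import math


def cheng_prusoff_ki(ic50, probe_conc, probe_kd):
    """Ki from IC50; all three arguments must share the same concentration unit."""
    return ic50 / (1.0 + probe_conc / probe_kd)


def pki_from_ki(ki_molar):
    """Negative decimal logarithm of Ki given in mol/l."""
    return -math.log10(ki_molar)


# Hypothetical example: IC50 = 100 nM measured with 3 nM probe of Kd = 3 nM
ki = cheng_prusoff_ki(100e-9, 3e-9, 3e-9)
print(f"Ki = {ki * 1e9:.1f} nM, pKi = {pki_from_ki(ki):.2f}")
```

The correction grows with the probe concentration relative to its K d , which is one reason why a probe concentration close to K d keeps IC 50 and K i within a small factor of each other.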
Out of the tested ligands, UNSW-MK259, which represents the non-labelled analogue of UR-CG072, had the largest deviation from the best regression line (figure 5). The reason for this deviation is unknown but may be connected to potential dualsteric binding modes of UNSW-MK259, UR-MK342 and UR-CG072, which could alter the binding mechanism. However, explaining this effect remains the topic of future studies.

Adjusting live-cell microscopy assay for measuring UR-CG072 binding to M 4 receptor
To keep the cells viable and with normal morphology during imaging experiments, it is necessary to maintain specific conditions, like 5% CO 2 , 37°C, and sufficient nutrient concentrations in the media. These parameters may start to drift over long periods. Therefore, UR-CG072 was selected for the live-cell assay due to its faster binding and dissociation kinetics. First, it was confirmed that the binding of 2 nM UR-CG072 to CHO-K1-hM 4 R cells can be detected by fluorescence microscopy (figure 6a1). For non-specific binding controls, a similar experiment was performed in the presence of 5 µM scopolamine. As illustrated in figure 6, there is a significant difference in fluorescence intensity between total binding (figure 6a1) and non-specific binding (figure 6b1).
To confirm that all the signal is specifically caused by ligand binding to M 4 receptors, the binding of 2 nM UR-CG072 to CHO-K1 cells not expressing M 4 receptor was measured. Under these conditions, there was no detectable accumulation of UR-CG072 to CHO-K1 cells (figure 6c1).
The results show that the differences between cell contour and cell body fluorescence intensities are smaller for the flatter CHO-K1 cells compared to HEK293 cells used in a previous study [35], which are elongated in the Z-direction. Therefore, it is necessary to analyse the fluorescence intensity of the whole cell body, which introduces an increased proportion of cell autofluorescence to the signal. Moreover, the imaging experiments were carried out in nutrient-rich cell culture media rather than DPBS buffer, as is suggested in [35]. This removes the need for more expensive special imaging media but increases background fluorescence levels. Combining these effects with using 5-TAMRA fluorophore instead of Cy3B as was used in [35], the overall signal level was greatly reduced in this assay compared to the previous one. However, as biological variability is still the main contributor to the assay uncertainty, reducing such variability at the cost of a reduced signal is still beneficial to the overall assay quality. These images also reveal that the amount of M 4 receptors on the cell membrane surface differs between cells and that in some cells ligand binding could not be detected at all (figure 6a1,a2). This aspect should be considered when moving on to single-cell-based quantification using this particular cell line. However, the current microscopy method averages the signal from a large number of cells and all cells used on a single assay day are seeded from the same population. Furthermore, the absolute intensity values have no direct or systematic influence on the calculated ligand-binding parameters (K d , k on , k off and K i ) and only lead to a lowered signal-to-noise ratio. While a higher signal-to-noise ratio is beneficial in general, in the current case, it only has a limited impact on the overall measurement uncertainty.

Comparison of random forest and deep learning-based image analysis pipelines
Since the morphologies of HEK293 cells and CHO-K1 cells and the fluorescence probes are different, it was necessary to adjust the original pipeline previously developed for HEK293 cell analysis [35]. Due to a lower contour contrast of CHO-K1 cells compared to HEK293 cells, it was necessary to quantify the fluorescence intensity from the entire cell mask instead of only the cell contours.
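Quantifying the fluorescence from the entire cell mask amounts to averaging the pixel intensities that fall inside the predicted mask. The sketch below shows this with plain nested lists; the function name is hypothetical, and the actual pipeline implementation in Aparecium may differ (for example in background handling).

```python
def mean_intensity_in_mask(image, mask):
    """Mean pixel intensity over all pixels where the mask is truthy.

    image and mask are equally sized 2D lists (rows of pixel values).
    """
    total = 0.0
    count = 0
    for image_row, mask_row in zip(image, mask):
        for value, inside in zip(image_row, mask_row):
            if inside:
                total += value
                count += 1
    if count == 0:
        raise ValueError("mask contains no cell pixels")
    return total / count


# Tiny made-up fluorescence image and binary cell mask
image = [[104, 100, 100],
         [105, 103, 100],
         [100, 100, 100]]
mask = [[1, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(mean_intensity_in_mask(image, mask))
```

Averaging over the whole mask rather than over the contour alone includes more pixels per cell, at the cost of including more autofluorescence from the cell body.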
A second adjustment to the original pipeline was needed due to the lower apparent brightness of the ligand-receptor complex. While the NAPS-Cy3B fluorescence signal in the dopamine D 3 receptor system was close to twice as high as the image background intensity [35], the signal of CHO-K1-hM 4 R bound UR-CG072 was only 4% above the background signal. Due to the high absolute signal level in the D 3 receptor system, it was not necessary to find the in-focus fluorescence plane in the original pipeline and instead the maximum intensity projection of the Z-stack could be used. In the current case, this approach is not suitable and leads to complete signal degradation (data not shown). Therefore, the fluorescence intensity must be quantified from the highest quality focal plane.
Improvements were also introduced into the model development and ground-truth generation process. The original pipeline relied on a human analyst to detect cell contours from bright-field images. Even though it was necessary to perform this step only once, it required significant manual labour, and detecting cell contours from bright-field images is more difficult than detecting them from fluorescence images. To address these issues, another approach was pursued. The cell membranes were stained with the lipophilic dye DiI and then imaged in both fluorescence and bright-field channels. For a small number of fluorescence images, cell masks were manually drawn. Next, machine-learning models RF-FL-1 and U-Net3-FL-1 were trained to generate cell masks from the DiI stained fluorescence images. These models were in turn used to predict the masks from a larger dataset of fluorescence images. The prediction masks then served as slightly lower quality, but significantly higher quantity ground truth for the next set of models (RF-BF-1, RF-BF-2 and U-Net3-BF-1), which predict the cell masks from bright-field images. The same conceptual approach was successful for training both the RF-based pipeline implemented in ilastik and the U-Net3 based DL pipeline developed using Jupyter notebooks [64] and the Keras DL framework [57]. Overall, both developed pipelines were superior to the original pipeline from the pipeline development perspective, requiring significantly less manual annotation.

Prediction quality comparison
The prediction quality of the DL models and the ilastik-based RF model was compared to determine the most suitable pipeline for analysis (figure 7 and table 1). Visually, all models can segment most of the cells from the bright-field images with good quality. The main difference between the models is that U-Net3-BF-1 (figure 7h) produces cells with more consistent and smooth shapes, similar to the ground truth (figure 7b), while RF-BF-2 (figure 7g) creates rugged edges and also detects many small fragmented objects far from the cells. Numerically, the quality of bright-field detection [...] among other parameters. The F 1 scores of the fluorescence image-based predictions of both U-Net3-FL-1 (figure 7d) and RF-FL-1 (figure 7c) are already substantially lower than unity. At the same time, the U-Net3-BF-1 model has only slightly lower quality metrics compared to U-Net3-FL-1, but RF-BF-2 has substantially lower metrics compared to RF-FL-1. This may indicate that a large proportion of the errors made by the DL pipeline originates from the training of the fluorescence model U-Net3-FL-1 rather than the bright-field model U-Net3-BF-1. Interestingly, when comparing the U-Net3-FL-1 model predictions and the U-Net3-BF-1 model predictions directly to one another instead of comparing these to the manually generated ground truth, the corresponding F 1 score is 0.87. This is lower than the similarity between either the U-Net3-FL-1 model and manual ground truth (F 1 score = 0.91) or U-Net3-BF-1 and manual ground truth (F 1 score = 0.89). It means that the U-Net3-BF-1 model can surpass the prediction quality of the U-Net3-FL-1 model in some instances while failing to do so in other cases.
The ability of U-Net3-BF-1 to avoid at least some of the mispredictions generated by the U-Net3-FL-1 model could mean that the proposed strategy of bright-field model generation is likely to work even with relatively small manually annotated datasets without the risk of overfitting. Interestingly, the RF-FL-1 model has a higher F 1 score and MCC value compared to the U-Net3-FL-1 model. However, these numbers should not be used to draw conclusions about the general power of a particular machine-learning approach since the training sets for the models were not identical. Different training sets were used for practical reasons. For example, the datasets were chosen to be small enough to allow training of the models within a few hours and without the need for unconventionally large computational resources while still achieving sufficiently high quality.
Furthermore, analysing the competition, saturation and kinetic experiments with both U-Net3-BF-1 and RF-BF-2 models provides the opportunity to compare the pipeline performances not only on the image level but also on the pharmacological level. As the most commonly used metric for fit quality, the R 2 values of the nonlinear model from each experiment were compared in a pairwise manner. The analysis revealed that the R 2 values obtained from the DL pipeline are statistically significantly higher compared to the RF pipeline (p = 0.03), calculated as described in 'Material and methods'. U-Net3-BF-1 based cell detection had higher average R 2 values (mean = 0.93 ± 0.05 and median = 0.939) compared to the RF-BF-2 based cell detection pipeline (mean = 0.89 ± 0.09 and median = 0.911). The relatively large standard deviation of the R 2 values shows that the algorithmic uncertainty is not the primary source of uncertainty, and instead, the variability is caused by biological factors. The high average R 2 values indicate that both pipelines work well in general, and the difference is not very large in absolute terms, but also that the small inaccuracies in the cell segmentation stage are not cancelled out during the post-processing steps. Instead, the errors are carried over and degrade the final fitting quality. Therefore, the U-Net3 based DL pipeline can still offer considerable advantages over the RF-based approach at both the image level and the downstream nonlinear regression level. Thus, from the quality perspective, it is reasonable to prefer the DL pipeline with U-Net3-BF-1 over the RF pipeline using the RF-BF-2 model. As the U-Net3-BF-1 model showed higher overall quality, all the results presented below were obtained using the DL pipeline.

Usability of deep learning and ilastik pipelines for microscopy image analysis
In addition to model quality, the usability aspects of the developed pipelines were compared. The most relevant ones were general computational hardware requirements, pipeline speed, the convenience of using the pipelines in terms of user interfaces, and finally, the convenience of developing new machine-learning models in case of adapting the developed assay for a different microscope or cell line. The ilastik-based RF models were found to be substantially slower than the U-Net3 based DL models used for analysing the microscopy images. The difference was especially evident when a GPU (graphics processing unit) was used for computations, which considerably sped up the DL models. A modern computer was able to analyse the results with both DL and ilastik pipelines in a time comparable to that needed for preparing an experiment or performing the imaging, thus making the analysis quite manageable. On average, analysing a single 904 × 1224 pixel image took 12 s with the RF pipeline and 3.5 s with the DL pipeline.
Compared to spectroscopy methods, the large data volumes generated by the microscopy experiments may cause storage issues. Therefore, before using the proposed microscopy methods, the user should make sure that sufficient storage space is available for the experiments.
Another aspect to consider is the analysis convenience, which in the case of image analysis software is related to the need to adjust the algorithm parameters manually and to perform some of the image analysis, pre-processing, or post-processing steps manually. For both DL and ilastik pipelines, no manual parameter adjustment is needed, removing one common obstacle in image analysis. In addition to choosing convenient machine-learning models, it was necessary to choose a suitable interface for using the machine-learning models and performing the pre- and post-processing steps. Many such interfacing software tools, such as FIJI (DeepImageJ [69]), CellProfiler [70] and ImJoy [71], allow almost unlimited flexibility for developing image analysis pipelines but also require that users have some knowledge of how image analysis pipelines work internally. These tools currently also do not provide convenient out-of-the-box options for the metadata handling required for pharmacological assays. Therefore, we chose Aparecium software (https://gpcr.ut.ee/aparecium.html) as the interfacing platform, as it is specifically designed for making image analysis pipelines as user-friendly as possible through graphical user interfaces (GUIs) while providing enough options for post-processing and metadata handling to carry out the biochemical analysis, at the cost of less flexibility for general image analysis.
Finally, the aspect of machine-learning model development was considered, as it is usually necessary to retrain the models from scratch or perform transfer learning if the method is used for widely different datasets [72,73]. In this study, two quite different model development environments were used. Model development in ilastik is relatively straightforward, requires no programming skills and is done entirely through a GUI provided by the standalone ilastik software. Installing the software is very simple, and there are multiple tutorials available for using the GUI. Development of the DL models, including U-Net, is somewhat more difficult, requiring access to a Python installation and preferably to a Jupyter notebook server. However, this process is significantly simplified thanks to the recently developed ZeroCostDL4Mic framework [72]. ZeroCostDL4Mic reduces the training process to a point-and-click level without the need to adjust the code. Therefore, both ilastik and DL image analysis pipelines are sufficiently simplified that model training does not require extensive past experience, with ilastik being the simplest option. Consequently, the ilastik pipeline with the RF model is recommended for machine-learning applications where ease of use is more important than a slight loss in quality. These practical considerations are quite dynamic as software tools develop and are likely to change in the future.

Determination of binding affinities with UR-CG072 to M 4 receptor in live-cell microscopy
For determining the binding affinity of UR-CG072, an assay design similar to the radioligand saturation binding experiment was used. From these data a K d of 2.85 ± 0.10 nM was obtained (figure 8), which is also in good agreement with all previous results (table 2). Interestingly, there is a small decline in non-specific binding with increasing concentration ( figure 8), but it is not of biological origin and is instead explained by a shadow-imaging effect which is caused by non-specific binding of UR-CG072 to the well surface, making the background brighter than the cells. This effect, however, does not interfere with the overall measurement, and the slope is not statistically significantly different from 0. A control saturation binding experiment with CHO-K1 cells not expressing M 4 receptors shows that there is no ligand binding to the wild-type CHO-K1 cells (electronic supplementary material, figure S1) and thus all the observed binding is to the M 4 receptors. Due to the good photostability of the 5-TAMRA label and the moderate kinetic rates of UR-CG072 binding, the k on and k off of UR-CG072 could be measured with the described live-cell system. The binding of UR-CG072 (figure 9) is fully reversible by the addition of 10 µM scopolamine after 3 h of association (indicated by the arrow). Moreover, the K d (2.6 ± 0.7 nM) obtained from kinetic data is in good agreement with previous values from both saturation binding assays as well as FA assays (table 2).
Lastly, competition binding experiments were carried out to confirm that the developed microscopy method is also suitable for screening novel unlabelled ligands in the future. Displacement curves were obtained for six ligands with varying structures, affinities and efficacies (figure 10). Regression analysis was used to obtain the IC 50 values from these data, which in turn were used to calculate pK i values of the unlabelled ligands (table 3).

Discussion
The M 4 receptor is connected to multiple diseases and is, therefore, an interesting target for drug development. In modern drug screening, fluorescence-based methods have gained popularity, but the limited availability of fluorescent ligands for the M 4 receptor has significantly hindered studies of this receptor. Recently, a set of new dibenzodiazepinone-type fluorescent ligands with high affinity to the M 4 receptor was synthesized, of which UR-MK342 and UR-CG072 were labelled with TAMRA [16]. In previous studies, the TAMRA label has been successfully used in FA assays, among other methods [90][91][92]. Therefore, these probes are promising candidates for developing new ligand-binding assays for the M 4 receptor. Experimental results from the FA assay show that both UR-MK342 and UR-CG072 bind to M 4 receptors with high affinity, and the K d values are in good agreement with previous radioligand binding measurements. Although both ligands also have sufficiently high signals and Z′ values to be compatible with HTS assay standards, UR-CG072 is preferred in screening assays due to its faster binding kinetics, which allows reduction of the required incubation time and mitigates the effects of evaporation and potential sedimentation.
Since UR-CG072 and UR-MK342 have previously been studied in the M 2 receptor FA assay system [17], similarities and differences between the two receptor systems present an opportunity to gain more insight into their binding mechanism. Interestingly, the FA value of the receptor-ligand complex remains the same regardless of which receptor subtype, M 2 or M 4 , is measured. This similarity is evident for both fluorescent ligands. By contrast, the receptor-ligand complex FA value depends on which fluorescent ligand is used and is consistently lower for UR-CG072 compared to UR-MK342 in complexes with both M 2 and M 4 receptor subtypes. This may indicate that the binding poses and the rotational freedom of the fluorophore moiety are similar between the two subtypes.
There are also some differences in the binding properties of these probes between the FA assays of M 2 and M 4 receptors. Both ligands seem to show a somewhat higher affinity towards the M 2 receptor, but the differences are relatively small [17]. This is expected as orthosteric binding sites of M 2 and M 4 are structurally very similar [9]. However, there could be differences in the binding site accessibility since the association kinetics of both probes to the M 2 receptor are faster compared to the M 4 receptor. This is not surprising as ligand binding to muscarinic receptors is known to be a complex process even for somewhat smaller ligands [...] receptor subtypes [93]. The most striking difference between ligand binding to M 2 and M 4 receptors is the apparent lack of a clear two-phase kinetic behaviour in the case of the M 4 receptor while it is present for the M 2 receptor. This could indicate that M 2 and M 4 receptor systems have differences beyond the orthosteric binding site properties, as the two-phase behaviour of the M 2 receptor-ligand binding was not specific to the FA or BBV system but was also present in a nanoBRET assay with mammalian cells [17]. This may indicate heterogeneity of the M 2 receptor population, where receptors with multiple affinity states are present while the apparent heterogeneity is also ligand-dependent. For the M 4 receptor, such heterogeneity was not observed with the ligands used in this study. The nature of this heterogeneity remains elusive but may be explained by the simultaneous existence of M 2 receptor dimers and monomers or ligand interactions with M 2 receptor allosteric sites. Dimerization of the M 2 receptor is also supported by multiple previous studies while there is no information available about M 4 receptor dimerization [94][95][96].
The FA method with BBV particles has many advantages, such as kinetics measurement possibilities, relatively low cost, fast measurements and receptor source stability, which is achieved by using a single production batch of the BBV particle stock. Therefore, the FA-based assay is a suitable option for HTS applications. However, there are also several differences between the BBV particle model system and in vivo or ex vivo conditions. For example, the live-cell systems allow studying G-protein and β-arrestin signalling and other protein-protein interactions. Furthermore, cholesterol in the membrane has an effect on ligand binding to muscarinic receptors [97], thus using live mammalian cells allows obtaining more relevant measurement results. Therefore, a live cell-based assay system, although still having notable differences from in vivo systems, is a significant step closer to native systems. Live-cell assays also have some general disadvantages, such as slightly higher cost per experiment due to the more advanced equipment required to perform the measurements and maintain cell culture. Additionally, live-cell measurements usually have higher uncertainty due to day-to-day variability. It must also be considered that the live-cell systems, which overexpress the receptors, do not fully reflect the natural system and may lead to considerable biases.
The results show that receptor-ligand complex formation on the surface of live cells can be studied by automated fluorescence microscopy, which allows relatively fast measurements and high content spatio-temporal data collection. In addition, microscopy images can be used to study cell morphology, fluorophore localization, cell migration and cell death, which cannot be easily achieved with flow cytometry or nanoBRET based measurement systems. These parameters can be useful for studying GPCR signalling [98]. Although a wide variety of advanced microscopy techniques allow measuring these parameters in great detail, high-end microscopy is often not automatic and, therefore, not suitable for high content studies. By contrast, automatic plate reader-based microscopes achieve a unique balance between the data volume and quality. This kind of automated live-cell microscopy has previously been used to study ligand binding to dopamine D 3 receptors [35]. In the present study, this method was further developed to enable the quantification of receptor-ligand binding in both equilibrium and kinetic modes. The faster kinetics of UR-CG072 compared to UR-MK342 favour using it in live-cell assays, as shorter experiment times avoid negative effects such as cell detachment, changes in nutrient and oxygen concentration and cell death. Although microscopy methods also pose some challenges related to data volumes, data analysis speed and data analysis pipeline usability, the results of this study show that suitable software and machine-learning models overcome these problems. The model comparison shows that while the DL pipeline provides higher quality results, the ilastik pipeline models are easier to retrain.
As the final pharmacological parameters obtained with U-Net3-BF-1 and RF-BF-2 models are similar, with an average LogIC 50 difference of 0.15 units between the models for an individual displacement curve, both options are viable in practice depending on the needed quality and the user's level of expertise. A unique challenge with machine-learning-based image analysis pipelines is the need to retrain the models if a sufficiently large domain shift is introduced into the assay, such as changing the cell line or microscopy setup. Fortunately, this has to be done only once for a particular assay setup, and easy-to-use options exist for retraining the models. Altogether, the data analysis is not a limiting factor of the proposed live-cell assay.
Using the described live-cell microscopy approach combined with machine-learning-based data analysis allowed measuring ligand binding to M 4 receptor with high quality. It is important to mention that such quality can be achieved with the fluorescence signal of bound UR-CG072 being only 4% above the background, which is substantially less than almost 200% achieved with the human embryonic kidney 293 cells expressing dopamine D 3 receptors (HEK293-D 3 R) system in a previous study [35]. This lower signal is caused by a combination of multiple factors. First, TAMRA fluorophore used in UR-CG072 has a lower quantum yield compared to Cy3B used in NAPS-Cy3B ligand. Second, the M 4 receptor is not expressed in all CHO-K1-hM 4 R cells, while the D 3 receptor was expressed in HEK293 cells. As a final factor, using the cell culture medium instead of DPBS during imaging increases the image background intensity. Surprisingly, the reduction of the absolute signal by a factor of 50 does not affect the final uncertainty of the measurements to any significant extent. The obtained R 2 values for both saturation binding and displacement experiments are very similar for both D 3 and M 4 receptor microscopy assays. Essentially, it means that biological variability is the highest contributor to the total uncertainty while decreased signal has negligible uncertainty contribution. This, in turn, means that this kind of assay design should work just as well with either relatively low quantum yield fluorophores or vice-versa with systems that have receptor expression more comparable to physiological expression levels if a high brightness probe is available. Thus, the proposed approach to study ligand binding to receptors has a much wider application range than previously demonstrated. Finally, the current results prove the universality of this kind of microscopy assay, as switching to another receptor and cell line did not require major changes to the analysis pipeline or assay protocol.
The developed live-cell microscopy assay can be performed in the saturation binding mode, in association and dissociation kinetic modes, and in displacement experiments for measuring the affinity of unlabelled ligands. The kinetic measurements show that the fluorescence signal is quite stable once equilibrium is reached after the association phase. It is also evident that scopolamine induces full displacement of UR-CG072 from the M4 receptor, as the signal returns to the level of the starting point (figure 9). The signal does not reach zero after dissociation, but this is caused by autofluorescence, not by incomplete dissociation. The kinetics of UR-CG072 are also fast enough that association and dissociation can be measured while the morphology of the CHO-K1-hM4R cells remains normal and the cells remain attached to the plate for the entire experiment. Both the kinetic measurements and the saturation experiments show that UR-CG072 retains its high affinity towards the M4 receptor in the live-cell system, as expected from previous radioligand binding studies [16], while having a very low level of non-specific binding to the cells. This also makes UR-CG072 a promising fluorescent probe for more advanced microscopy methods such as live-cell total internal reflection fluorescence (TIRF) microscopy. The displacement curves obtained with UR-CG072 and unlabelled ligands are of quite high quality and, therefore, the assay is suitable for determining the affinities of unlabelled ligands for the M4 receptor. The system remains stable for the duration of long experiments, meaning that accurate endpoint measurements can be obtained for an entire microplate even though imaging the full plate is not instantaneous. These properties also suggest that the assay can be used for small-scale screening of novel ligands, for example, to confirm binding affinities in a live-cell system.
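The kinetic modes mentioned above can be illustrated with the textbook one-site binding model, in which the observed association rate is k_obs = k_on·[L] + k_off and the kinetic dissociation constant is Kd = k_off / k_on. The Python sketch below is written under that assumption and is not necessarily the model used for fitting in this study. All function names and numerical values are hypothetical. The `background` term stands for the non-zero plateau after dissociation, such as the one attributed above to autofluorescence.

```python
import math


def association_signal(t, plateau, k_obs, background=0.0):
    """One-phase association: signal rises from background to background + plateau."""
    return background + plateau * (1.0 - math.exp(-k_obs * t))


def dissociation_signal(t, start, k_off, background=0.0):
    """One-phase dissociation: specific signal decays, leaving only the background."""
    return background + start * math.exp(-k_off * t)


def kinetic_kd(k_obs, k_off, ligand_conc):
    """Kinetic Kd = k_off / k_on, with k_on = (k_obs - k_off) / [L].

    Rates share one time unit (e.g. 1/min); the result has the unit of
    ligand_conc (e.g. mol/L).
    """
    k_on = (k_obs - k_off) / ligand_conc
    if k_on <= 0:
        raise ValueError("k_obs must be larger than k_off")
    return k_off / k_on


# Hypothetical rate constants (illustrative only, not measured values).
k_obs = 0.06          # 1/min, observed association rate at 2 nM probe
k_off = 0.02          # 1/min, dissociation rate
probe_conc = 2e-9     # mol/L

kd = kinetic_kd(k_obs, k_off, probe_conc)
print(f"kinetic Kd = {kd * 1e9:.2f} nM")
```

Comparing a kinetic Kd of this kind with the Kd from saturation binding is a common internal consistency check for a binding assay.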
The live-cell system is also internally consistent, as the Kd values obtained from saturation binding measurements and from kinetic measurements are in excellent agreement. Overall, the Kd values of UR-CG072 obtained from saturation and kinetic experiments in both the FA and live-cell microscopy assays are in good agreement with each other (table 2 and figure 11). The pKi values of M4 receptor ligands determined with UR-CG072 using either the FA or the live-cell microscopy assay were also in good agreement (R² = 0.91). The slope of the correlation was 0.84, while the intercept was 2.2. The live-cell method systematically estimates higher affinities for low-affinity ligands, while for high-affinity ligands in the nanomolar range, the estimated values are numerically more similar between the assays (figure 11). However, most of the low-affinity ligands are agonists, while the high-affinity ligands are antagonists. Therefore, it is difficult to determine whether the systematic difference between the assays applies to low-affinity ligands in general or simply to agonists. Agonism as the cause of the systematic difference is theoretically well-founded, as the high-affinity receptor state is usually stabilized by G-proteins, which are not present in the BBV particles. A similarly good correlation was previously found between a nanoBRET assay and an FA assay using the same probe with the M2 receptor (R² = 0.94), with the same systematic differences between the pKi values measured in BBV particles and in live cells [17]. This further supports the view that the systematic difference between the determined agonist pKi values is caused by differences between the BBV particle and live-cell systems.
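The effect of the reported slope (0.84) and intercept (2.2) can be made concrete with a short calculation. The Python sketch below assumes that the live-cell pKi is the dependent variable of the correlation and the FA pKi the independent one. The axis assignment is not stated above, but this orientation is the one consistent with the live-cell assay giving higher affinities for low-affinity ligands. The function names and the input pKi values are hypothetical, and the coefficients are used as rounded in the text.

```python
SLOPE = 0.84      # slope of the correlation, as reported above
INTERCEPT = 2.2   # intercept of the correlation, as reported above


def predicted_live_cell_pki(fa_pki):
    """Live-cell pKi predicted from an FA pKi (assumed axis assignment)."""
    return SLOPE * fa_pki + INTERCEPT


def assay_offset(fa_pki):
    """Predicted live-cell pKi minus FA pKi; positive means higher live-cell affinity."""
    return predicted_live_cell_pki(fa_pki) - fa_pki


# Hypothetical FA pKi values spanning low (agonist-like) to high
# (antagonist-like) affinities; these are not ligands from the study.
for fa_pki in (5.0, 6.0, 7.0, 8.0, 9.0):
    print(f"FA pKi {fa_pki:.1f} -> live-cell pKi "
          f"{predicted_live_cell_pki(fa_pki):.2f} "
          f"(offset {assay_offset(fa_pki):+.2f})")
```

Under these assumptions the predicted offset shrinks as affinity increases, which matches the qualitative pattern described above. The line alone cannot separate an effect of affinity from an effect of agonism.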
The developed live-cell microscopy assay can be modified for wider applications in the future. One development direction is further automation of the assay by removing the remaining manual steps from the data analysis process. This could also include a more standardized pipeline for machine-learning model development or a larger set of pretrained models that cover the detection of the most common cell lines. We believe that the assay can be developed further to the point where live-cell microscopy could also be used in an HTS context. Another development direction is a shift towards more natural systems, such as tissue preparations, live tissues or tumour spheroids, and the measurement of downstream signalling events in addition to ligand binding. Using these more challenging systems requires finding suitable fluorophores to overcome tissue autofluorescence and ligands with suitable kinetic properties to slow down fluorescence ligand dissociation during washing steps. Additionally, more advanced DL models may be necessary. The present study serves as a solid foundation for such developments.
As for more general unlabelled ligand screening, both the FA and live-cell microscopy methods with the fluorescence ligands could be used as rapid and convenient options for guiding the synthesis of novel M4 receptor ligands and allosteric modulators. Both methods also allow kinetic measurements, which may help uncover more detailed binding mechanisms. Overall, the choice of a suitable method for a specific experiment depends strongly on the required throughput and the availability of equipment. While the FA method with BBV particles fulfils many requirements of HTS applications, live cells are a vastly more flexible option for studying complex signalling pathways. Therefore, live-cell microscopy-based ligand-binding assays are likely to have an ever-growing role in the future of ligand-binding studies.
Data accessibility. The data that support the findings of this study are openly available from the repository of the University of Tartu. UT-GPCR001 microscopy data of ligand binding to M4 muscarinic receptor in live CHO-K1-hM4 cells: http://dx.doi.org/10.23673/re-306.
Electronic supplementary material is available online [99].